Evolutionary Dynamics of Q-Learning over the Sequence Form

Authors

Fabio Panozzo

Nicola Gatti

Marcello Restelli

Abstract

Multi-agent learning is a challenging open task in artificial intelligence. An interesting connection is known between multi-agent learning algorithms and evolutionary game theory: the learning dynamics of some algorithms can be modeled as replicator dynamics with a mutation term. Inspired by the recent sequence-form replicator dynamics, we develop a new version of the Q-learning algorithm that works on the sequence form of an extensive-form game, thus allowing an exponential reduction in the length of the dynamics with respect to those of the normal form. The dynamics of the proposed algorithm can be modeled by using the sequence-form replicator dynamics with a mutation term. We show that, although sequence-form and normal-form replicator dynamics are realization equivalent, the Q-learning algorithm applied to the two forms has non-realization-equivalent dynamics. Differently from previous works on evolutionary game theory models for multi-agent learning, we produce an experimental evaluation to show the accuracy of the model.

Similar resources

A MODEL FOR EVOLUTIONARY DYNAMICS OF WORDS IN A LANGUAGE

Human language, over its evolutionary history, has emerged as one of the fundamental defining characteristics of modern man. However, this milestone evolutionary process through natural selection has not left any 'linguistic fossils' that may enable us to trace back the actual course of development of language and its establishment in human societies. Lacking analytical tools to fathom the cr...

Iranian EFL Learners’ Motivational Fluctuation in Task Performance over Different Timescales

Motivation for learning a new language is both self- and time-oriented. The language learner's motivation fluctuates gradually over time, and the view of oneself is different on each timescale of the study. Interaction among different timescales throughout Second Language Development (SLD) is a novel area of investigation (de Bot, 2015). In order to probe this interactive nature, t...

Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic

In this paper, a model-free reinforcement learning-based controller is designed to extract a treatment protocol because the design of a model-based controller is complex due to the highly nonlinear dynamics of cancer. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In the Q-learning algorithm, each entry of the Q-table is updated using data...

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storage, processing, sensing and communication capabilities to monitor physical or environmental conditions. WSNs face a number of challenges because of limited battery power, communication, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is one of the hard research problems. Exhaustive search techniques like dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, much more memory and processing are needed and these methods become unsuitable, so we have to use random and evolutionary methods. The use of evolutionary methods, beca...

Publication date: 2014
